DocEdit: Language-Guided Document Editing

Abstract

Professional document editing tools require a certain level of expertise to perform complex edit operations. To make editing tools accessible to increasingly novice users, we investigate intelligent document assistant systems that can make or suggest edits based on a user's natural language request. Such a system should be able to understand the user's ambiguous requests and contextualize them to the visual cues and textual content found in a document image to edit localized unstructured text and structured layouts. To this end, we propose a new task of language-guided localized document editing, where the user provides a document and an open vocabulary editing request, and the intelligent system produces a command that can be used to automate edits in real-world document editing software. In support of this task, we curate the DocEdit dataset, a collection of approximately 28K instances of user edit requests over PDF and design templates along with their corresponding ground truth software executable commands. To our knowledge, this is the first dataset that provides a diverse mix of edit operations with direct and indirect references to the embedded text and visual objects such as paragraphs, lists, tables, etc. We also propose DocEditor, a Transformer-based localization-aware multimodal (textual, spatial, and visual) model that performs the new task. The model attends to both document objects and related text contents which may be referred to in a user edit request, generating a multimodal embedding that is used to predict an edit command and an associated bounding box localizing it. Our proposed model empirically outperforms other baseline deep learning approaches by 15-18%, providing a strong starting point for future work.

Similar resources

Multimedia Document Authoring using WYSIWYM editing

This paper outlines a future ideal multimedia document authoring system that would allow authors to specify content and form of the document independently of each other and at a high level of abstraction. One of the main challenges in a system of this kind is to ensure the coherence of the generated document, which implies, among other things, that the information expressed by the different media us...

CCDES - a collaborative compound document editing

A collaborative editing system allows co-authors at different locations to edit a shared view of a single document simultaneously. A compound document binds various types of information to create a single seamless presentation. A collaborative compound document editing system is developed to combine both the systems described above. It supports distributed editing with replicated compound docu...

Initial Steps Toward Automating Legal Document Editing

Drafting successful legal documents requires experienced lawyers, careful research, and nuanced reasoning. Nonetheless, drafting also requires many mundane and time-intensive tasks that delay the process and increase its cost. One consequence of this is that law firms hire a large number of support employees to assist with these low-level tasks. According to US Bureau of Labor Statistics data r...

Topological clustering guided document binarization

The current approach for text binarization proposes a clustering algorithm as a preprocessing stage to an energy-based segmentation method. It uses a clustering algorithm to obtain a coarse estimate of the background (BG) and foreground (FG) pixels. These estimates are used as a prior for the source and sink points of a graph cut implementation, which is used to efficiently find the minimum ene...

Guided Interactive Volume Editing in Medicine

Various medical imaging techniques, such as Computed Tomography, Magnetic Resonance Imaging, Ultrasonic Imaging, are now gold standards in the diagnosis of different diseases. The diagnostic process can be greatly improved with the aid of automatic and interactive analysis tools, which, however, require certain prerequisites in order to operate. Such analysis tools can, for example, be used for...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2023

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v37i2.25282